Gibbs entropy of network ensembles by cavity methods

The Gibbs entropy of a microcanonical network ensemble is the logarithm of the number of network configurations compatible with a set of hard constraints. This quantity characterizes the level of order and randomness encoded in features of a given real network. Here we show how to relate this entropy to large deviations of conjugated canonical ensembles. We derive exact expressions for this correspondence using the cavity method for the configuration model, for ensembles with constrained degree sequence and community structure, and for the ensemble with constrained degree sequence and number of links at a given distance.

PACS numbers: 05.90.+m, 89.75.Hc, 89.75.Fb

The evolution of complex networks is usually described by non-equilibrium stochastic dynamics. However, a network's specific topological structure may reveal relevant organizational principles, such as universality of the large-scale structure or hierarchical communities, that are sure to impact dynamical processes taking place on the network.

Extracting relevant statistical information encoded in the network's structure is a fundamental concern of community detection algorithms and other inference problems. To study these problems, several authors have suggested entropy-based methods, which are grounded in the information theory of networks [11-16]. These methods have proved to be very useful. In fact, in a series of recent papers [11-19] it has been shown that one may extend ideas and concepts of statistical mechanics and information theory to complex network ensembles.

In this paradigm, one generalizes the typical random graph ensembles studied in the mathematical literature [20] to ensembles characterized by an extensive number of constraints that fix, for example, the degree sequence [21], the number of links between different communities, the number of links at a given distance [12, 13], degree correlations between linked nodes, acyclic networks [17], or even networks with a given number of triangles [18] and generalized motifs [19].

It is well known that in statistical mechanics we distinguish between microcanonical ensembles, which describe the set of all microscopic configurations compatible with a given value of the total energy, and canonical ensembles, which correspond to microscopic configurations in which the total energy fluctuates around a given mean. A pivotal result of statistical mechanics is the equivalence of these ensembles in the thermodynamic limit, i.e., in the limit where the number of particles in the system is very large. Similarly, in the theory of random graphs we distinguish between the $G(N, L)$ ensemble, which consists of all networks with $N$ nodes and a total of exactly $L$ links, and the $G(N, p)$ ensemble, which is formed by all networks of $N$ nodes with the total number of links being a Poisson distributed random variable with average $\langle L \rangle = pN(N-1)/2$. Exploiting the parallelism between statistical mechanics and the theory of random graphs we can call the random graph ensemble $G(N, L)$ a microcanonical network ensemble and the $G(N, p)$ graph ensemble a canonical network ensemble. Similarly to statistical mechanics, the random graph ensembles $G(N, L)$ and $G(N, p)$ are asymptotically equivalent in the thermodynamic limit as long as $L$ of the $G(N, L)$ ensemble and $p$ of the $G(N, p)$ ensemble are related by the equality $L = pN(N-1)/2$.
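The asymptotic equivalence of the two random graph ensembles can be illustrated numerically. The following Python sketch compares the logarithm of the number of graphs in $G(N, L)$ with the Shannon entropy of the conjugate $G(N, p)$ ensemble; the function names and the chosen values of $N$ and $L$ are illustrative assumptions, not taken from the original text.

```python
import math

def microcanonical_entropy(n_nodes: int, n_links: int) -> float:
    """ln of the number of graphs in G(N, L), i.e. ln C(N(N-1)/2, L)."""
    pairs = n_nodes * (n_nodes - 1) // 2
    return (math.lgamma(pairs + 1) - math.lgamma(n_links + 1)
            - math.lgamma(pairs - n_links + 1))

def canonical_entropy(n_nodes: int, p: float) -> float:
    """Shannon entropy of G(N, p): one independent Bernoulli(p) variable per node pair."""
    pairs = n_nodes * (n_nodes - 1) // 2
    return -pairs * (p * math.log(p) + (1 - p) * math.log(1 - p))

# Illustrative values (assumed here): N = 200 nodes and L = 995 links.
n_nodes, n_links = 200, 995
p = n_links / (n_nodes * (n_nodes - 1) / 2)  # conjugate value, L = p N (N - 1) / 2
gap_per_node = (canonical_entropy(n_nodes, p)
                - microcanonical_entropy(n_nodes, n_links)) / n_nodes
print(p, gap_per_node)
```

With a single constraint the canonical entropy exceeds the microcanonical one only by a term that grows logarithmically with the number of links, so the gap per node becomes negligible for large $N$.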

It was shown in [12, 13, 15] that the parallel construction between network ensembles can be extended to much more complex networks. In fact it is possible to define microcanonical network ensembles by imposing a set of hard constraints that must be satisfied by each network in the ensemble, and canonical network ensembles, which satisfy soft constraints, i.e., constraints that are satisfied on average. The set of constraints might fix, for example, the degree sequence, the community structure or the spatial structure of networks embedded in space.

A widely studied example of a microcanonical network ensemble is the configuration model [21], which fixes the degree sequence, i.e., the degrees of all nodes in the network. On the other hand, canonical network ensembles that impose soft constraints on the degree sequence have been studied under different names ("hidden variable model" and "fitness model") by the physics [22-24] and statistics [25] communities.

In a recent work [15] it has been shown that if the number of constraints is extensive, the microcanonical ensemble and its conjugate canonical ensemble are no longer equivalent. In particular, using a network entropy measure, it was shown that a microcanonical ensemble has lower entropy than the conjugate canonical ensemble, even though the marginal probabilities take the same expression. An example of this difference was given by comparing the microcanonical ensemble of regular networks with fixed degree $k_i = c \in \mathbb{N}$ for all nodes $i = 1, \ldots, N$ and the canonical Poisson network ensemble with average degree $\bar{k}_i = c$ for every $i = 1, 2, \ldots, N$, where the bar refers to the ensemble average. It is easy to check that in this paradigmatic case the entropy of the regular networks is smaller than the entropy of the Poisson networks with the same average degree. The importance of this topological difference is also revealed by the observation that dynamical models defined on microcanonical network ensembles and on the corresponding canonical ones display different critical behavior.

The calculation of the entropy of arbitrary microcanonical ensembles was performed in [12, 13] using a Gaussian approximation and in [11, 16] by exact path integral approaches restricted to sparse networks with a constrained degree sequence. Here we show an extension of the exact results found in [11, 16] using the more transparent cavity method [26, 27] and derive the correspondence between the entropies of microcanonical and conjugate canonical ensembles.



I. ENTROPY OF SIMPLE CANONICAL 
NETWORK ENSEMBLES 

We first consider a canonical ensemble of simple networks, each consisting of $N$ nodes and characterized by an adjacency matrix $\{a_{ij}\} \in \{0,1\}^{N \times N}$. A link between two nodes $i$ and $j$ may be present ($a_{ij} = 1$) or absent ($a_{ij} = 0$). The network is simple in that self-interactions are not permitted and the adjacency matrix is symmetric.

Each network is drawn with probability $P(\{a\}) = \prod_{i<j} \pi_{ij}(a_{ij})$. The link between nodes $i$ and $j$ is present with probability $p_{ij} = \pi_{ij}(1)$ and is otherwise absent with probability $1 - p_{ij} = \pi_{ij}(0)$.

The ensemble is subject to $\kappa = 1, \ldots, M$ structural constraints of the type

$$ f_\kappa(p) = F_\kappa, \tag{1} $$

where $f_\kappa(p)$ is a constraint function on the probability matrix $\{p\}$, which consists of matrix elements $p_{ij}$, and $F_\kappa \in \mathbb{R}$ is the constraining value.

In accordance with the principle of maximal entropy [28], the link probabilities for this canonical ensemble are provided by the maximization of the Shannon entropy of network ensembles [15],

$$ S[p] = -\sum_{i<j} \sum_{a \in \{0,1\}} \pi_{ij}(a) \ln \pi_{ij}(a) = -\sum_{i<j} \left[ p_{ij} \ln p_{ij} + (1 - p_{ij}) \ln (1 - p_{ij}) \right], \tag{2} $$

subject to the constraints of Eq. (1). This optimization gives rise to the maximal-entropy canonical network ensemble, which is a generalization of the $G(N, p)$ random network ensemble. The marginal probabilities $p_{ij}$ are given as the solution to the system of equations

$$ \frac{\partial}{\partial p_{ij}} \left[ S[p] + \sum_{\kappa=1}^{M} \lambda_\kappa f_\kappa(p) \right] = 0, \tag{3} $$

where the $\lambda_\kappa \in \mathbb{R}$ are Lagrange multipliers enforcing the constraints.



Let us consider the simple case of constraints on the expected degree of each node, i.e., we select $\bar{k}_i$ such that our $M = N$ constraints given by Eq. (1) take the form

$$ \sum_{j=1}^{N} p_{ij} = \bar{k}_i, \qquad i = 1, \ldots, N. \tag{4} $$

The marginal probabilities $p_{ij}$ that satisfy Eq. (3) are given as

$$ p_{ij} = \frac{e^{-\lambda_i - \lambda_j}}{1 + e^{-\lambda_i - \lambda_j}} = \frac{\theta_i \theta_j}{1 + \theta_i \theta_j}, \tag{5} $$

with the Lagrange multipliers $\lambda_i$ fixed by Eq. (4) and the variables $\theta_i = e^{-\lambda_i}$, which are commonly referred to as "hidden variables" [22-24]. In Table I we generalize this procedure to network ensembles satisfying a number of different structural constraints.
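As a concrete illustration of how the hidden variables can be obtained from Eqs. (4) and (5), the following Python sketch solves for the $\theta_i$ by a damped fixed-point iteration. The solver, its damping, the initial guess and the example degree sequence are choices made here for illustration; the text itself does not prescribe a numerical method.

```python
import math

def solve_hidden_variables(expected_degrees, sweeps=5000, tol=1e-13):
    """Find theta_i such that sum_j theta_i theta_j / (1 + theta_i theta_j) = kbar_i.

    Damped fixed-point iteration; the damping factor 1/2 is an assumption
    made here for numerical stability, not part of the original derivation.
    """
    n = len(expected_degrees)
    norm = math.sqrt(sum(expected_degrees))
    theta = [k / norm for k in expected_degrees]  # sparse-limit initial guess
    for _ in range(sweeps):
        new_theta = []
        for i in range(n):
            denom = sum(theta[j] / (1.0 + theta[i] * theta[j])
                        for j in range(n) if j != i)
            new_theta.append(0.5 * (theta[i] + expected_degrees[i] / denom))
        change = max(abs(a - b) for a, b in zip(new_theta, theta))
        theta = new_theta
        if change < tol:
            break
    return theta

def link_probability(theta, i, j):
    """Eq. (5): p_ij = theta_i theta_j / (1 + theta_i theta_j)."""
    x = theta[i] * theta[j]
    return x / (1.0 + x)

# Hypothetical expected degree sequence with values 2, 3 and 4 on 20 nodes.
kbar = [2.0 + (i % 3) for i in range(20)]
theta = solve_hidden_variables(kbar)
realized = [sum(link_probability(theta, i, j) for j in range(20) if j != i)
            for i in range(20)]
```

The list `realized` holds the expected degrees implied by the fitted hidden variables and can be compared entry by entry with `kbar`.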



II. LARGE DEVIATIONS OF CANONICAL 
ENSEMBLES SOLVED BY THE CAVITY 
METHOD 

The constraints for canonical ensembles are satisfied only on average; it is therefore relevant to investigate the probability of large fluctuations in these ensembles. The entropy for large deviations $\Omega[\{G_\kappa\}]$ of canonical ensembles is defined as

$$ \Omega[\{G_\kappa\}] = -\lim_{N \to \infty} \frac{1}{N} \ln \left[ \sum_{\{a_{ij}\}} P(\{a\}) \prod_{\kappa=1}^{M} \delta\left[ G_\kappa - g_\kappa(a) \right] \right], \tag{6} $$

where the delta functions $\delta(\ldots)$ enforce the $\kappa = 1, \ldots, M$ hard constraints

$$ g_\kappa(a) = G_\kappa, \tag{7} $$

with $g_\kappa(a)$ being the constraining function specified on the adjacency matrix and $G_\kappa \in \mathbb{N}$ the constrained value. The quantity $\Omega[\{G_\kappa\}] \geq 0$ measures the probability that networks in a canonical ensemble satisfy Eqs. (7): with the sign convention of Eq. (6), this probability scales as $e^{-N \Omega[\{G_\kappa\}]}$. If $\Omega[\{G_\kappa\}]$ is small, the probability that the networks in the canonical ensemble satisfy the topological constraints is large. Large values of $\Omega[\{G_\kappa\}]$, on the other hand, correspond to large deviations of the canonical ensemble, i.e., the networks satisfying the hard constraints are rare.

The exact calculation of $\Omega[\{G_\kappa\}]$ has been performed using path integral methods [11, 16] with linear hard constraints that fix the degree sequence.

Using the cavity method, we now demonstrate how to compute Eq. (6) for more general cases of canonical ensembles and hard constraints fixing, for example, (i) the degree sequence, (ii) the community structure, and (iii) the number of links at a given distance.

Constraints | Probabilities $p_{ij}/(1-p_{ij})$ | Conditions
Given expected number of links $L$ | $p/(1-p)$ | $pN(N-1)/2 = L$
Given expected community structure $\{A(q,q')\}$ | $W(q_i, q_j)$ | $A(q,q')|_{q \neq q'} = \sum_{i,j} p_{ij} \delta_{q_i,q} \delta_{q_j,q'}$
Given expected degree sequence $\{\kappa_i\}$ | $\theta_i \theta_j$ | $\kappa_i = \sum_j p_{ij}$
Given expected degree sequence and community structure $\{A(q,q')\}$ | $\theta_i \theta_j W(q_i, q_j)$ | $\kappa_i = \sum_j p_{ij}$; $A(q,q')|_{q \neq q'} = \sum_{i,j} p_{ij} \delta_{q_i,q} \delta_{q_j,q'}$; $A(q,q) = \sum_{i<j} p_{ij} \delta_{q_i,q} \delta_{q_j,q}$
Spatial networks: given expected degree sequence $\{\kappa_i\}$ and number of links at distance $d \in I_s$, $\{B_s\}$ | $\theta_i \theta_j W(s_{ij})$ | $\kappa_i = \sum_j p_{ij}$; $B(s) = \sum_{i<j} p_{ij} \chi_s(d_{ij})$
Given expected degree sequence and number of triangles for each node $\{T_i\}$ | … | $\kappa_i = \sum_j p_{ij}$; $T_i = \sum_{j,k} p_{ij} p_{jk} p_{ki}$; $f_{ij} = \sum_k p_{ik} p_{kj}$; $g_{ij} = \sum_k p_{ik} \alpha_k p_{kj}$

TABLE I: Maximum-entropy network ensembles with a given set of constraints. The probability $p_{ij}$ of each link is given for network ensembles in which we impose different types of constraints. These probabilities are expressed in terms of the "hidden variables" of the ensembles, $\{\theta_i\}$, $W(q,q')$, $W(d)$, $\{\alpha_i\}$, which are determined by the respective "conditions" specified in the table. In the network ensembles with given community structure, the community of each node is associated with a Potts variable $q_i = 1, \ldots, Q$ with $Q = O(\sqrt{N})$. In the network ensemble embedded in a physical space the distance between the nodes is binned in $L$ intervals $I_s = [d_s, d_s + \Delta d_s)$ and is indicated by a discrete variable $s_{ij} = s$ if the distance $d_{ij}$ between the nodes $i$ and $j$ satisfies $d_{ij} \in I_s$. The functions $\chi_s(d)$ are indicator functions of the intervals $I_s$, i.e., $\chi_s(d) \in \{0, 1\}$ and $\chi_s(d) = 1$ if and only if $d \in I_s$.



In order to apply the cavity method to the calculation of $\Omega[\{G_\kappa\}]$ it is first necessary to describe the factor graph we will consider, which is depicted in Fig. 1. Following recent efforts to evaluate the number of loops in networks [29-31], we take the variables of the factor graph to be the matrix elements $a_\ell$ of the adjacency matrix, where the index $\ell = 1, \ldots, N(N-1)/2$ identifies each possible link of the undirected network [33]. The factor nodes, which are identified by Greek letters $\alpha = 1, 2, \ldots, M$, denote the $M$ topological constraints imposed on the network. In particular, each factor node $\alpha$ is linked to a list of variables, which are identified by the set $\partial\alpha$. Likewise, variable $\ell$ is connected to a set of constraints, which is indicated as $\partial\ell$. In our ensemble we assume that the number of constraints connected to a variable $\ell$ is the same for each variable of the factor graph and is given by $|\partial\ell|$.
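To make the factor-graph construction concrete, the following Python sketch builds it for the simplest case in which the only constraints are the $N$ node degrees, so that every variable has $|\partial\ell| = 2$. The function name and the data layout are hypothetical choices made for this illustration.

```python
from itertools import combinations

def degree_factor_graph(n_nodes):
    """Factor graph for hard degree constraints.

    Variables: the N(N-1)/2 node pairs (i, j) with i < j, one per entry a_ij.
    Factors:   one per node i, attached to every pair that contains i.
    """
    variables = list(combinations(range(n_nodes), 2))
    factor_to_vars = {i: [] for i in range(n_nodes)}  # the sets "partial alpha"
    var_to_factors = {}                               # the sets "partial l"
    for pair in variables:
        var_to_factors[pair] = list(pair)
        for node in pair:
            factor_to_vars[node].append(pair)
    return variables, factor_to_vars, var_to_factors

variables, factor_to_vars, var_to_factors = degree_factor_graph(5)
```

Adding a community or distance constraint would attach one further factor to every pair, which gives the connectivity $|\partial\ell| = 3$ drawn in Fig. 1.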



FIG. 1: Factor graph used for the cavity calculations. The variable nodes $\ell$ are indicated with circles and have a fixed connectivity $|\partial\ell| = 3$ in the figure. The factor nodes are instead indicated by squares. Their role is to impose the hard constraints defined in Eqs. (7).



The cavity method remains valid even for $M = O(N^2)$, but the scaling $M = O(N)$ is necessary (as will become clear in the following derivation) to ensure that the entropy $\Omega[\{G_\kappa\}]$ remains finite.



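A. Large deviations of canonical ensembles with linear constraints

The constraint given by Eq. (7) now fixes the degrees of the factor nodes, i.e.,

$$ K_\alpha = \sum_{\ell \in \partial\alpha} a_\ell, \tag{8} $$

with $\alpha = 1, \ldots, M$ and factor node degree $K_\alpha \in \mathbb{N}$. Correspondingly, we can write Eq. (6) as

$$ \Omega[\{K_\alpha\}] = -\lim_{N \to \infty} \frac{1}{N} \ln \left[ \sum_{\{a_\ell\}} \prod_{\ell} p_\ell^{a_\ell} (1 - p_\ell)^{1 - a_\ell} \prod_{\alpha=1}^{M} \delta\left( K_\alpha - \sum_{\ell' \in \partial\alpha} a_{\ell'} \right) \right], \tag{9} $$

where $p_\ell$ is the probability of link $\ell$ in the canonical ensemble. Within the summation of Eq. (9) and for each value $a_\ell$, we introduce the identity $1 = x^{a_\ell} x^{-a_\ell}$, which is parametrized by $x > 0$. We can then define $\Omega_N[\{K_\alpha\}, x]$ as

$$ \Omega_N[\{K_\alpha\}, x] = -\frac{1}{N} \ln \left[ \sum_{\{a_\ell\}} \prod_{\ell} (p_\ell x)^{a_\ell} (1 - p_\ell)^{1 - a_\ell} \prod_{\alpha=1}^{M} \delta\left( K_\alpha - \sum_{\ell' \in \partial\alpha} a_{\ell'} \right) \right] + \frac{L}{N} \ln(x), \tag{10} $$

where $L$ is the total number of distinguishable links fixed by the constraints in Eq. (8). On configurations that satisfy Eq. (8) the factors $x^{-a_\ell}$ multiply to $x^{-L}$, which gives the last term. The introduction of the parameter $x$ at this stage is completely irrelevant, and in fact the relation

$$ \Omega[\{K_\alpha\}] = \lim_{N \to \infty} \Omega_N[\{K_\alpha\}, x] \tag{11} $$

holds for all values of $x$. However, in what follows, we will focus on the particular limiting case where $x$ tends to 0. Thus, we write

$$ \Omega[\{K_\alpha\}] = \lim_{x \to 0} \lim_{N \to \infty} \Omega_N[\{K_\alpha\}, x]. \tag{12} $$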

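The calculation of $\Omega_N[\{K_\alpha\}, x]$ may be formulated in terms of the cavity method or the Belief Propagation (BP) algorithm [26, 27], aimed at determining $\ln Z$ with $Z$ defined as

$$ Z = \sum_{\{a_\ell\}} \prod_{\ell} (p_\ell x)^{a_\ell} (1 - p_\ell)^{1 - a_\ell} \tag{13} $$

$$ \qquad \times \prod_{\alpha=1}^{M} \delta\left( K_\alpha - \sum_{\ell' \in \partial\alpha} a_{\ell'} \right), \tag{14} $$

where the entropy $\Omega_N[\{K_\alpha\}, x]$ is given by

$$ N \Omega_N[\{K_\alpha\}, x] = -\ln Z + L \ln(x). \tag{15} $$

The "messages" of this BP algorithm are sent between variable and factor nodes. In particular, we define $\nu_{\ell \to \alpha}(a_\ell)$ as the message sent from variable node $\ell$ to factor node $\alpha$, indicating the probability that matrix element $\ell$ takes value $a_\ell$ in the absence of constraint $\alpha$. We correspondingly define $\hat\nu_{\alpha \to \ell}(a_\ell)$ as the message that the factor node $\alpha$ sends to variable $\ell$, which accounts for the distribution of all variables connected to $\alpha$ except variable $\ell$. The BP update rules [26, 27] take the form

$$ \nu_{\ell \to \alpha}(a_\ell) = \frac{1}{C_{\ell \to \alpha}} (p_\ell x)^{a_\ell} (1 - p_\ell)^{1 - a_\ell} \prod_{\beta \in \partial\ell \setminus \alpha} \hat\nu_{\beta \to \ell}(a_\ell), \tag{16} $$

$$ \hat\nu_{\alpha \to \ell}(a_\ell) = \frac{1}{C_{\alpha \to \ell}} \sum_{\{a_{\ell'}\}_{\ell' \in \partial\alpha \setminus \ell}} \delta\left[ K_\alpha - a_\ell - \sum_{\ell' \in \partial\alpha \setminus \ell} a_{\ell'} \right] \prod_{\ell' \in \partial\alpha \setminus \ell} \nu_{\ell' \to \alpha}(a_{\ell'}), \tag{17} $$

where $C_{\ell \to \alpha}$ and $C_{\alpha \to \ell}$ are normalization constants. To proceed, we make the ansatz that the cavity distribution satisfies a binomial form

$$ \nu_{\ell \to \alpha}(a_\ell) = h_{\ell,\alpha}^{a_\ell} (1 - h_{\ell,\alpha})^{1 - a_\ell}, \tag{18} $$

which is parametrized by fields $h_{\ell,\alpha} \in \mathbb{R}$. Using integral representations of delta functions, we calculate the cavity messages given by Eq. (17) as

$$ \hat\nu_{\alpha \to \ell}(a_\ell) = \frac{1}{C_{\alpha \to \ell}} \int \frac{dz}{2\pi} \, e^{-iz[K_\alpha - a_\ell]} \prod_{\ell' \in \partial\alpha \setminus \ell} \left[ 1 - h_{\ell',\alpha} (1 - e^{iz}) \right]. \tag{19} $$

Assuming self-consistently that the $h_{\ell',\alpha}$ are small, we approximate the product in the above equation as

$$ \hat\nu_{\alpha \to \ell}(a_\ell) = \frac{1}{C_{\alpha \to \ell}} \int \frac{dz}{2\pi} \exp\left[ -iz[K_\alpha - a_\ell] - \sum_{\ell' \in \partial\alpha \setminus \ell} h_{\ell',\alpha} (1 - e^{iz}) \right], \tag{20} $$

which on a suitable transformation of variables takes the form of Hankel's contour integral, giving

$$ \hat\nu_{\alpha \to \ell}(a_\ell) = \frac{1}{C_{\alpha \to \ell}} \frac{1}{\Gamma(K_\alpha + 1 - a_\ell)} \exp\left( h_{\ell,\alpha} - \sum_{\ell' \in \partial\alpha} h_{\ell',\alpha} \right) \left( \sum_{\ell' \in \partial\alpha} h_{\ell',\alpha} - h_{\ell,\alpha} \right)^{K_\alpha - a_\ell}. \tag{21} $$

Finally, inserting the above result into Eq. (16), we get that $h_{\ell,\alpha}$ satisfies the recursion equation

$$ \frac{h_{\ell,\alpha}}{1 - h_{\ell,\alpha}} = \frac{p_\ell x}{1 - p_\ell} \prod_{\beta \in \partial\ell \setminus \alpha} \frac{K_\beta}{\sum_{\ell' \in \partial\beta} h_{\ell',\beta} - h_{\ell,\beta}}. \tag{22} $$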
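The two steps leading to the Poisson form of the factor-to-variable message can be checked numerically. The Python sketch below first evaluates the integral of an exponentiated integrand of the type appearing in Eq. (20) over $z \in [0, 2\pi]$ (the integration range is an assumption made here, appropriate for integer-valued constraints) and compares it with a Poisson weight, and then compares the exact distribution of a sum of independent binary variables with small fields $h$ to the same Poisson law. The field values are hypothetical.

```python
import cmath
import math

def poisson_pmf(mean, n):
    """Poisson weight mean^n e^{-mean} / n!."""
    return math.exp(-mean + n * math.log(mean) - math.lgamma(n + 1))

def contour_integral(mean, n, points=256):
    """(1/2pi) * integral over [0, 2pi] of exp(-i z n - mean (1 - e^{iz})), trapezoidal rule."""
    total = 0.0 + 0.0j
    for m in range(points):
        z = 2.0 * math.pi * m / points
        total += cmath.exp(-1j * z * n - mean * (1.0 - cmath.exp(1j * z)))
    return (total / points).real

def sum_of_bernoulli_pmf(fields):
    """Exact distribution of a sum of independent Bernoulli(h) variables."""
    pmf = [1.0]
    for h in fields:
        nxt = [0.0] * (len(pmf) + 1)
        for s, w in enumerate(pmf):
            nxt[s] += w * (1.0 - h)
            nxt[s + 1] += w * h
        pmf = nxt
    return pmf

fields = [0.01 + 0.0002 * l for l in range(50)]  # hypothetical small cavity fields
mean = sum(fields)
exact = sum_of_bernoulli_pmf(fields)
integral_error = max(abs(contour_integral(mean, n) - poisson_pmf(mean, n))
                     for n in range(6))
poisson_error = max(abs(exact[n] - poisson_pmf(mean, n)) for n in range(6))
```

The first error is at the level of floating-point rounding, while the second is controlled by the sum of the squared fields, which is why the approximation requires the $h_{\ell,\alpha}$ to be small.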



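Provided that every link exists with probability $p_\ell \neq 1$, we can choose the value of $x$ to be sufficiently small so as to approximate $h_{\ell,\alpha}$ by

$$ h_{\ell,\alpha} = \frac{p_\ell x}{1 - p_\ell} \prod_{\beta \in \partial\ell \setminus \alpha} \frac{K_\beta}{\sum_{\ell' \in \partial\beta} h_{\ell',\beta} - h_{\ell,\beta}}. \tag{23} $$

Since we have assumed that every variable $\ell$ is linked to exactly $|\partial\ell| = 2, 3, \ldots$ factor nodes, Eq. (23) is solved for every value of $x \ll 1$ by the cavity field

$$ h_{\ell,\alpha} = x^{1/|\partial\ell|} \, \hat{h}_{\ell,\alpha}, \tag{24} $$

where the exponent follows from power counting (the right-hand side of Eq. (23) carries one factor of $x$ and $|\partial\ell| - 1$ inverse fields) and the cavity fields $\hat{h}_{\ell,\alpha}$ satisfy the equation

$$ \hat{h}_{\ell,\alpha} = \frac{p_\ell}{1 - p_\ell} \prod_{\beta \in \partial\ell \setminus \alpha} \frac{K_\beta}{-\hat{h}_{\ell,\beta} + \sum_{\ell' \in \partial\beta} \hat{h}_{\ell',\beta}}. \tag{25} $$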



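Equations (24) and (25) define the cavity fields $h_{\ell,\alpha}$, which are indeed small for sufficiently small values of $x$, as previously assumed. Finally, using the BP algorithm [26, 27] we can derive the marginal distributions for the factor graph, which are given by

$$ P_\ell(a_\ell) = \frac{1}{C_\ell} (p_\ell x)^{a_\ell} (1 - p_\ell)^{1 - a_\ell} \prod_{\beta \in \partial\ell} \hat\nu_{\beta \to \ell}(a_\ell), $$

$$ P_\alpha(\{a_\ell\}_{\ell \in \partial\alpha}) = \frac{1}{C_\alpha} \delta\left( K_\alpha - \sum_{\ell' \in \partial\alpha} a_{\ell'} \right) \prod_{\ell' \in \partial\alpha} \left[ (p_{\ell'} x)^{a_{\ell'}} (1 - p_{\ell'})^{1 - a_{\ell'}} \prod_{\beta \in \partial\ell' \setminus \alpha} \hat\nu_{\beta \to \ell'}(a_{\ell'}) \right], $$

where $C_\ell$ and $C_\alpha$ are normalization constants that satisfy

$$ C_\ell = \prod_{\alpha \in \partial\ell} \hat\nu_{\alpha \to \ell}(0) \left[ p_\ell x \prod_{\beta \in \partial\ell} \frac{K_\beta}{\sum_{\ell' \in \partial\beta} h_{\ell',\beta} - h_{\ell,\beta}} + 1 - p_\ell \right], $$

$$ C_\alpha = \pi_{\sum_{\ell \in \partial\alpha} h_{\ell,\alpha}}(K_\alpha) \prod_{\ell' \in \partial\alpha} (1 - p_{\ell'}) \prod_{\beta \in \partial\ell' \setminus \alpha} \hat\nu_{\beta \to \ell'}(0). \tag{26} $$

The term $\pi_\mu(n) = \mu^n e^{-\mu}/n!$ gives the probability for a Poisson distributed random variable $n$ with average $\mu$. Following [26, 27], in terms of our marginal distributions the quantity $-\ln Z$, with $Z$ defined in Eq. (14), may be expressed as the minimum of the Bethe free energy

$$ G_{\mathrm{Bethe}}[\{P_\alpha\}, \{P_\ell\}] = \sum_{\alpha=1}^{M} \sum_{\{a_\ell\}_{\ell \in \partial\alpha}} P_\alpha(\{a_\ell\}) \ln \frac{P_\alpha(\{a_\ell\})}{\psi_\alpha(\{a_\ell\})} - \sum_{\ell} (|\partial\ell| - 1) \sum_{a_\ell \in \{0,1\}} P_\ell(a_\ell) \ln \frac{P_\ell(a_\ell)}{\psi_\ell(a_\ell)}, \tag{27} $$

where $|\partial\ell|$ indicates the number of factor nodes connected to variable $\ell$ and

$$ \psi_\alpha(\{a_\ell\}_{\ell \in \partial\alpha}) = \delta\left( K_\alpha - \sum_{\ell \in \partial\alpha} a_\ell \right) \prod_{\ell \in \partial\alpha} \psi_\ell(a_\ell) \tag{28} $$

and

$$ \psi_\ell(a_\ell) = (p_\ell x)^{a_\ell} (1 - p_\ell)^{1 - a_\ell}. \tag{29} $$

Inserting the expression for the marginal distributions, Eq. (26), into Eq. (27) we obtain the result [33] that

$$ -\ln Z = -\sum_{\alpha=1}^{M} \ln C_\alpha + (|\partial\ell| - 1) \sum_{\ell=1}^{N(N-1)/2} \ln C_\ell. \tag{30} $$

Using the definition of the entropy of large deviations, Eq. (15), and the expressions in Eq. (26) for $C_\alpha$ and $C_\ell$, together with Eqs. (23)-(25) for the cavity fields, we get, for $x \ll 1$, that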
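$$ N \Omega_N[\{K_\alpha\}, x] = -\sum_{\alpha=1}^{M} \ln C_\alpha + (|\partial\ell| - 1) \sum_{\ell=1}^{N(N-1)/2} \ln C_\ell + L \ln(x) $$

$$ \quad = -\sum_{\alpha=1}^{M} \ln \left[ \frac{1}{K_\alpha!} \left( x^{1/|\partial\ell|} \sum_{\ell \in \partial\alpha} \hat{h}_{\ell,\alpha} \right)^{K_\alpha} \exp\left( -x^{1/|\partial\ell|} \sum_{\ell \in \partial\alpha} \hat{h}_{\ell,\alpha} \right) \prod_{\ell \in \partial\alpha} (1 - p_\ell) \right] $$

$$ \qquad + (|\partial\ell| - 1) \sum_{\ell=1}^{N(N-1)/2} \ln \left[ p_\ell \prod_{\alpha \in \partial\ell} \frac{K_\alpha}{\sum_{\ell' \in \partial\alpha} \hat{h}_{\ell',\alpha} - \hat{h}_{\ell,\alpha}} + 1 - p_\ell \right] + L \ln(x). \tag{31} $$

Finally, taking the limits $x \to 0$ and $N \to \infty$ we get, according to Eq. (12),

$$ \Omega[\{K_\alpha\}] = \lim_{N \to \infty} \frac{1}{N} \left\{ -\sum_{\alpha=1}^{M} \ln \left[ \frac{1}{K_\alpha!} \left( \sum_{\ell \in \partial\alpha} \hat{h}_{\ell,\alpha} \right)^{K_\alpha} \right] + (|\partial\ell| - 1) \sum_{\ell=1}^{N(N-1)/2} \ln \left[ p_\ell \prod_{\alpha \in \partial\ell} \frac{K_\alpha}{\sum_{\ell' \in \partial\alpha} \hat{h}_{\ell',\alpha} - \hat{h}_{\ell,\alpha}} + 1 - p_\ell \right] - |\partial\ell| \sum_{\ell} \ln(1 - p_\ell) \right\}, \tag{32} $$

where the cavity fields $\hat{h}_{\ell,\alpha}$ are the solution of the BP Eq. (25).

B. Specific hard constraints

We now consider a few specific cases for the hard constraints, which allow us to simplify our expression Eq. (32) further.

1. Degree sequence

We first consider the ensemble known as the configuration model [21], in which the constraints fix the degree sequence $(k_1, k_2, \ldots, k_N) \in \mathbb{N}^N$ of the network, where

$$ k_i = \sum_{j=1}^{N} a_{ij}, \tag{33} $$

with $i = 1, \ldots, N$. In terms of the factor graph, each factor node $\alpha$ fixes the degree of a specific node $i$ in the undirected network. Recalling that variable $\ell$ represents the tuple $(i, j)$ in the adjacency matrix, the variable is linked to $|\partial\ell| = 2$ constraints that fix the degrees of nodes $i$ and $j$. Finally, the cavity fields $\hat{h}_{\ell,\alpha}$ can be written as $\hat{h}_{j,i}$, as we have identified factor node $\alpha$ with node index $i$ and, similarly, variable $\ell$ with nodes $i$ and $j$.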

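To simplify the expression Eq. (32) for $\Omega[\{k_i\}]$ we introduce

$$ \gamma_i = \sum_{j=1}^{N} \hat{h}_{j,i}. \tag{34} $$

Using Eq. (25) it is easy to show that the variables $\{\gamma_i\}$ satisfy the following equation

$$ \gamma_i = \sum_{j=1}^{N} \frac{p_{ij}}{1 - p_{ij}} \frac{k_j}{\gamma_j - \hat{h}_{i,j}}, \tag{35} $$

where $\hat{h}_{i,j}$ is given by the solution to Eq. (25). Finally, in the limit $x \to 0$, we get the exact result for the entropy of the large deviation of canonical network ensembles to be

$$ \Omega[\{k_i\}] = \lim_{N \to \infty} \frac{1}{N} \left\{ -\sum_{i=1}^{N} \ln \left[ \frac{\gamma_i^{k_i}}{k_i!} \right] + \sum_{i<j} \ln \left[ 1 + \frac{p_{ij}}{1 - p_{ij}} \frac{k_i k_j}{(\gamma_i - \hat{h}_{j,i})(\gamma_j - \hat{h}_{i,j})} \right] - \sum_{i<j} \ln(1 - p_{ij}) \right\}, \tag{36} $$

where $\sum_{i<j}$ indicates the sum over all links in the adjacency matrix. If $\hat{h}_{j,i} \ll \gamma_i$, Eq. (35) simplifies to

$$ \gamma_i = \sum_{j=1}^{N} \frac{p_{ij}}{1 - p_{ij}} \frac{k_j}{\gamma_j}, \tag{37} $$

which then gives in the diluted limit $p_{ij} \ll 1$ the result [11] that

$$ \Omega[\{k_i\}] = -\lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} \ln \left[ \frac{1}{k_i!} \gamma_i^{k_i} e^{-k_i} \right] \tag{38} $$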



for the configuration model. 
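The cavity equations for the degree-constrained case can be iterated directly at finite $N$. The following Python sketch solves Eq. (25) for $|\partial\ell| = 2$ and evaluates the bracket of Eq. (36) without taking the limit $N \to \infty$. The geometric damping of the update, the initial condition and the example of a regular degree sequence are assumptions introduced here for illustration; the sketch also assumes all degrees are at least 1.

```python
import math

def cavity_large_deviation_entropy(degrees, p, sweeps=1000, tol=1e-13):
    """Finite-N estimate of Omega[{k_i}] from the cavity fields of Eq. (25)."""
    n = len(degrees)
    ratio = [[p[i][j] / (1.0 - p[i][j]) if i != j else 0.0 for j in range(n)]
             for i in range(n)]
    # h[j][i]: rescaled cavity field of link (i, j) entering the degree factor of node i
    h = [[0.0 if i == j else 1.0 for i in range(n)] for j in range(n)]
    for _ in range(sweeps):
        gamma = [sum(h[j][i] for j in range(n)) for i in range(n)]
        new_h = [[0.0] * n for _ in range(n)]
        change = 0.0
        for j in range(n):
            for i in range(n):
                if i == j:
                    continue
                target = ratio[i][j] * degrees[j] / (gamma[j] - h[i][j])
                new_h[j][i] = math.sqrt(h[j][i] * target)  # geometric damping (assumed)
                change = max(change, abs(new_h[j][i] - h[j][i]))
        h = new_h
        if change < tol:
            break
    gamma = [sum(h[j][i] for j in range(n)) for i in range(n)]
    total = 0.0
    for i in range(n):
        total -= degrees[i] * math.log(gamma[i]) - math.lgamma(degrees[i] + 1)
    for i in range(n):
        for j in range(i + 1, n):
            x = degrees[i] * degrees[j] / ((gamma[i] - h[j][i]) * (gamma[j] - h[i][j]))
            total += math.log(1.0 + ratio[i][j] * x) - math.log(1.0 - p[i][j])
    return total / n

# Hypothetical example: regular degree c = 3 with uniform link probability c / (N - 1).
n, c = 200, 3
p = [[c / (n - 1) if i != j else 0.0 for j in range(n)] for i in range(n)]
omega = cavity_large_deviation_entropy([c] * n, p)
sparse_limit = -math.log(c ** c * math.exp(-c) / math.factorial(c))
```

For this regular example the finite-$N$ value `omega` can be compared with `sparse_limit`, the value $-\ln \pi_c(c)$ expected for sparse networks whose hard degrees equal the expected degrees.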



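2. Community structure and degree sequence

Suppose we assign to node $i$ a Potts index $q_i = 1, \ldots, Q$ that indicates the community to which the node $i$ belongs. In addition to the degree constraint given by Eq. (33), we also impose on the level of the adjacency matrix that

$$ A(q, q') = \sum_{i,j=1}^{N} \left( 1 - \tfrac{1}{2} \delta_{q,q'} \right) \delta_{q,q_i} \delta_{q',q_j} a_{ij}, \tag{39} $$

where $q \leq q' = 1, \ldots, Q$. The total number of constraints is $M = N + Q(Q-1)/2$.

Each variable node $\ell$ in our factor graph is now linked to three factor nodes: two constrain the degrees of nodes $i$ and $j$ separately in the undirected network, and a third one enforces the community structure $q_i, q_j$. Similarly to the previous case we introduce

$$ \gamma_\alpha = \sum_{\ell \in \partial\alpha} \hat{h}_{\ell,\alpha}, \tag{40} $$

where $\alpha \in \{i, j, (q_i, q_j)\}$ indicates the type of constraint. Given the cavity Eqs. (25), it can be shown that the variables $\{\gamma_\alpha\}$ satisfy the following equation

$$ \gamma_\alpha = \sum_{\ell \in \partial\alpha} \frac{p_\ell}{1 - p_\ell} \prod_{\beta \in \partial\ell \setminus \alpha} \frac{K_\beta}{\gamma_\beta - \hat{h}_{\ell,\beta}}, \tag{41} $$

where $K_\beta \in \{k_i, k_j, A(q_i, q_j)\}$, depending on the value of $\alpha$, and the cavity fields $\hat{h}_{\ell,\alpha}$ satisfy the cavity Eqs. (25). The entropy of large deviations $\Omega[\{K_\alpha\}]$ given by Eq. (32) can be expressed, with $\ell = (i, j)$, as

$$ \Omega[\{K_\alpha\}] = \lim_{N \to \infty} \frac{1}{N} \left\{ -\sum_{\alpha=1}^{M} \ln \left[ \pi_{\gamma_\alpha}(K_\alpha) \right] + 2 \sum_{i<j} \ln \left[ 1 + \frac{p_{ij}}{1 - p_{ij}} \frac{k_i \, k_j \, A(q_i, q_j)}{(\gamma_i - \hat{h}_{\ell,i})(\gamma_j - \hat{h}_{\ell,j})(\gamma_{q_i,q_j} - \hat{h}_{\ell,(q_i,q_j)})} \right] - \sum_{\alpha} \gamma_\alpha - \sum_{\ell} \ln(1 - p_\ell) \right\}. \tag{42} $$

In the case in which $\hat{h}_{\ell,\beta} \ll \gamma_\beta$ and the network is diluted, $p_\ell \ll 1$, we get

$$ \Omega[\{K_\alpha\}] = -\lim_{N \to \infty} \frac{1}{N} \sum_{\alpha=1}^{M} \ln \left[ \frac{1}{K_\alpha!} \gamma_\alpha^{K_\alpha} e^{-K_\alpha} \right]. \tag{43} $$

The value of $\Omega[\{K_\alpha\}]$ converges to a finite value in the limit $N \to \infty$ only if the number of constraints $M$ is of the same order of magnitude as $N$, i.e., $M = O(N)$, in other words if the number of communities is $Q = O(\sqrt{N})$. For $M = O(N^\xi)$ with $\xi \in (1, 2)$, we have that $\Omega[\{K_\alpha\}] \sim N^{\xi - 1}$.

3. Links at a given distance and degree sequence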


Let us embed the $N$ nodes in a metric space, such that two nodes $i$ and $j$ are a distance $d_{ij} < D$ apart. We divide the interval $[0, D]$ into $L = O(N)$ intervals $I_s = [d_s, d_s + \Delta d_s)$ with $s = 1, 2, \ldots, L$ and $d_{s+1} = d_s + \Delta d_s$. The constraint for the number of links at a given distance is given by specifying a sequence of integers $B_1, B_2, \ldots, B_L$ that satisfy

$$ B_s = \sum_{i<j}^{N} \chi_s(d_{ij}) \, a_{ij}, \tag{44} $$

where $\chi_s(d_{ij}) = 1$ if $d_{ij} \in I_s$ and $\chi_s(d_{ij}) = 0$ otherwise. The total number of constraints is in this case $M = N + L$.

Once again each variable $\ell$ is linked to $|\partial\ell| = 3$ factor nodes: two fix the degrees of nodes $i$ and $j$, and a third fixes the number of links $B_s$ in the interval $I_s$. We introduce the variables $\gamma_\alpha$ according to the definition

$$ \gamma_\alpha = \sum_{\ell \in \partial\alpha} \hat{h}_{\ell,\alpha}, \tag{45} $$

with $\alpha \in \{i, j, s_{ij}\}$. These parameters satisfy the following equation

$$ \gamma_\alpha = \sum_{\ell \in \partial\alpha} \frac{p_\ell}{1 - p_\ell} \prod_{\beta \in \partial\ell \setminus \alpha} \frac{K_\beta}{\gamma_\beta - \hat{h}_{\ell,\beta}}, \tag{46} $$



where the cavity fields solve the cavity equation (25). The entropy of large deviations $\Omega[\{K_\alpha\}]$ given by Eq. (32) can be expressed, with $\ell = (i, j)$, as

$$ \Omega[\{K_\alpha\}] = \lim_{N \to \infty} \frac{1}{N} \left\{ -\sum_{\alpha=1}^{M} \ln \left[ \pi_{\gamma_\alpha}(K_\alpha) \right] + 2 \sum_{i<j} \ln \left[ 1 + \frac{p_{ij}}{1 - p_{ij}} \frac{k_i \, k_j \, B_{s_{ij}}}{(\gamma_i - \hat{h}_{\ell,i})(\gamma_j - \hat{h}_{\ell,j})(\gamma_{s_{ij}} - \hat{h}_{\ell,s_{ij}})} \right] - \sum_{\alpha} \gamma_\alpha - \sum_{\ell} \log(1 - p_\ell) \right\}, \tag{47} $$

where the subscript $s_{ij}$ denotes the interval $s$ such that $d_{ij} \in I_s$. Using Eq. (46) in the limit of sparse networks with $p_\ell \ll 1$, the entropy of large deviations simplifies and takes the form

$$ \Omega[\{K_\alpha\}] = -\lim_{N \to \infty} \frac{1}{N} \sum_{\alpha=1}^{M} \ln \left[ \frac{1}{K_\alpha!} \gamma_\alpha^{K_\alpha} e^{-K_\alpha} \right]. \tag{48} $$

The value of $\Omega[\{K_\alpha\}]$ converges to a finite limit for $N \to \infty$ only if the number of constraints $M$ is of the same order of magnitude as $N$, i.e., $M = O(N)$, which requires $L = O(N)$. If $M \sim N^\xi$ with $\xi \in (1, 2)$ then $\Omega[\{K_\alpha\}] \sim N^{\xi - 1}$.

C. Special case for constraining degrees in sparse networks

Further simplifications of the expressions obtained in the previous section are possible when the constraining degrees $K_\alpha$ of sparse networks are the expected degrees over the canonical ensembles, i.e., $K_\alpha = \sum_{\ell \in \partial\alpha} p_\ell = \bar{K}_\alpha$. The BP equations simplify to give

$$ \hat{h}_{\ell,\alpha} = p_\ell. \tag{49} $$

Thus, Eq. (32) reduces to

$$ \Omega[\{k_\alpha\}] = -\lim_{N \to \infty} \frac{1}{N} \sum_{\alpha=1}^{M} \ln \pi_{k_\alpha}(k_\alpha). \tag{50} $$

Since this is the minimum value of $\Omega$, we obtain that, for $M = O(N)$, the limit $\lim_{N \to \infty} \Omega > 0$ and therefore the canonical ensemble is not self-averaging in the thermodynamic limit.

1. Degree sequence

In the situation wherein only the degree sequence of the network is constrained, we have $\kappa_i = \sum_{j=1}^{N} p_{ij} = k_i$ for all $i = 1, \ldots, N$. The entropy $\Omega[\{k_i\}]$ of the expected degrees in the configuration model takes the form

$$ \Omega[\{k_i\}] = -\sum_{k>0} p_k \ln[\pi_k(k)], \tag{51} $$

where $p_k$ is the probability for a node to have degree $k$.

2. Community structure and degree sequence

As in Sec. II B 2, each node $i$ is assigned a Potts index $q_i = 1, \ldots, Q$ that indicates the community to which the node belongs, with $Q = O(\sqrt{N})$. The expected degree constraints take the form

$$ k_i = \kappa_i = \sum_{j=1}^{N} p_{ij}, \tag{52} $$

$$ A(q, q') = \sum_{i,j=1}^{N} \left( 1 - \tfrac{1}{2} \delta_{q,q'} \right) \delta_{q,q_i} \delta_{q',q_j} \, p_{ij}, \tag{53} $$

for $i = 1, \ldots, N$ and $q \leq q' = 1, \ldots, Q$. The total number of constraints is in this case $M = N + Q(Q-1)/2 = O(N)$. The entropy $\Omega[\{k_i\}, \{A(q, q')\}]$ takes the value

$$ \Omega[\{k_i\}, \{A(q, q')\}] = -\sum_{k>0} p_k \ln[\pi_k(k)] - \lim_{N \to \infty} \frac{1}{N} \sum_{q \leq q'}^{Q} \ln \left[ \pi_{A(q,q')}(A(q, q')) \right]. \tag{54} $$

3. Links at a given distance and degree sequence

Following the setup of Sec. II B 3, the constraints in terms of expected degrees are given by Eq. (52) and

$$ B_s = \sum_{i<j}^{N} \chi_s(d_{ij}) \, p_{ij}, \tag{55} $$

where $i = 1, \ldots, N$ and $s = 1, \ldots, L$, and where $\chi_s(d_{ij}) = 1$ if $d_{ij} \in I_s$ and $\chi_s(d_{ij}) = 0$ otherwise. We now express $\Omega[\{k_i\}, \{B_s\}]$ as

$$ \Omega[\{k_i\}, \{B_s\}] = -\sum_{k>0} p_k \ln[\pi_k(k)] - \lim_{N \to \infty} \frac{1}{N} \sum_{s=1}^{L} \ln \left[ \pi_{B_s}(B_s) \right]. \tag{56} $$
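The sparse-limit expression of Eq. (51) is straightforward to evaluate for any degree sequence. The following Python sketch does so using the empirical degree distribution as $p_k$; the function names and the two example sequences are illustrative assumptions.

```python
import math
from collections import Counter

def log_poisson_at_mean(k):
    """ln pi_k(k) = ln(k^k e^{-k} / k!) for an integer k >= 0 (equal to 0 for k = 0)."""
    if k == 0:
        return 0.0
    return k * math.log(k) - k - math.lgamma(k + 1)

def sparse_large_deviation_entropy(degrees):
    """Eq. (51): Omega = -sum_k p_k ln pi_k(k), with p_k the empirical degree distribution."""
    n = len(degrees)
    counts = Counter(degrees)
    return -sum(m / n * log_poisson_at_mean(k) for k, m in counts.items())

omega_regular = sparse_large_deviation_entropy([3] * 1000)
omega_mixed = sparse_large_deviation_entropy([1] * 500 + [5] * 500)
```

Both values are strictly positive, in line with the statement that the canonical ensemble is not self-averaging when the number of constraints is extensive.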



III. ENTROPY OF SIMPLE 
MICROCANONICAL NETWORK ENSEMBLES 

So far we have investigated the entropy of simple canonical network ensembles and large deviations therein. In this section we derive an expression for the entropy $\Sigma$ of a microcanonical ensemble with linear constraints. Moreover, using the result of Eq. (32) we relate $\Sigma$ to the entropy $\Omega$ of the most likely configuration of a canonical ensemble when linear constraints are imposed.

Specifying $\kappa = 1, \ldots, M$ hard constraints on the adjacency matrix, as in Eq. (7), the microcanonical ensemble's entropy $\Sigma$ is given by

$$ \Sigma = \lim_{N \to \infty} \frac{1}{N} \ln Z_N, \tag{57} $$

where the partition function $Z_N$ is given by

$$ Z_N = \sum_{\{a_{ij}\}} \prod_{\kappa=1}^{M} \delta\left[ G_\kappa - g_\kappa(a) \right]. \tag{58} $$

In what follows we shall prove the relationship

$$ \Sigma = S^*[P] - \Omega^*[\{G_\kappa\}], \tag{59} $$

where $S^*[P]$, given by Eq. (2), is the Shannon entropy of the conjugated canonical ensemble, here understood per node so that it can be compared with $\Sigma$. The term $\Omega^*[\{G_\kappa\}]$, defined as in Eq. (6), is minus the logarithm, per node, of the probability that a network in the conjugated canonical ensemble satisfies the hard constraints.

Physically, Eq. (59) implies that a network satisfying the hard constraints of Eq. (7) belongs, with probability one, to the conjugated canonical ensemble. However, such networks make up only a fraction $e^{-N \Omega^*[\{G_\kappa\}]}$ of the total canonical ensemble measure.



A. Proof of correspondence between canonical and 
micro-canonical entropies 



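We now prove the relationship of Eq. (59) for the case of hard constraints specifying the degree sequence. In order to evaluate Eq. (58) in this case we use the integral representation of the delta functions, and we get

$$ Z_N = \int \prod_{i=1}^{N} \frac{d\omega_i}{2\pi} \exp\left[ -i \sum_{i=1}^{N} \omega_i k_i + \sum_{i<j} \ln\left( 1 + e^{i\omega_i + i\omega_j} \right) \right], \tag{60} $$

which, with the change of variables $z_i = \omega_i - \omega_i^*$, becomes

$$ Z_N = \int \prod_{i=1}^{N} \frac{dz_i}{2\pi} \exp\left[ F_N(\{z, \omega^*\}) \right], \tag{61} $$

with

$$ F_N(\{z, \omega^*\}) = -i \sum_{i=1}^{N} (\omega_i^* + z_i) k_i + \sum_{i<j} \ln\left( 1 + e^{i\omega_i^* + i\omega_j^*} \right) + \sum_{i<j} \ln\left[ 1 + p_{ij} \left( e^{iz_i + iz_j} - 1 \right) \right] \tag{62} $$

and the $\omega_i^*$ variables chosen so as to satisfy the marginal probabilities for the canonical ensemble, i.e.,

$$ p_{ij} = \frac{e^{i\omega_i^* + i\omega_j^*}}{1 + e^{i\omega_i^* + i\omega_j^*}}. \tag{63} $$

We observe that Eq. (62) can be expressed as

$$ F_N(\{z, \omega^*\}) = S[P, \{\omega^*\}] - \sum_{i} i z_i k_i + \sum_{i<j} \ln\left( 1 + p_{ij} \left( e^{iz_i + iz_j} - 1 \right) \right), \tag{64} $$

where $S[P, \{\omega^*\}] = -i \sum_i \omega_i^* k_i + \sum_{i<j} \ln( 1 + e^{i\omega_i^* + i\omega_j^*} )$ coincides with the Shannon entropy of Eq. (2) when $k_i = \sum_j p_{ij}$. Therefore, with simple manipulations it can be shown that the partition function can be written as

$$ Z_N = e^{N S^*[P]} \sum_{\{a_{ij}\}} \prod_{i<j} p_{ij}^{a_{ij}} (1 - p_{ij})^{1 - a_{ij}} \prod_{i=1}^{N} \delta\left( k_i - \sum_{j} a_{ij} \right) = \exp\left\{ N \left[ S^*[P] - \Omega_N[\{k_i\}, 1] \right] \right\}, \tag{65} $$

with $S^*[P]$ the Shannon entropy per node and $\Omega_N[\{k_i\}, 1]$ given by Eq. (10) evaluated at $x = 1$. Given the definition (57), this proves the relationship Eq. (59) between the entropies of microcanonical and conjugate canonical ensembles.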
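For very small networks the correspondence can be verified by exhaustive enumeration, since at finite $N$ the logarithm of the number of graphs with a given degree sequence equals the total Shannon entropy of the conjugate canonical ensemble plus the logarithm of the probability of hitting that degree sequence. The Python sketch below checks this for a hypothetical degree sequence on five nodes. The coordinate-wise bisection used to fit the canonical ensemble is a choice made here and is not part of the original proof.

```python
import math
from itertools import combinations, product

def conjugate_probabilities(degrees, sweeps=300):
    """Solve sum_j p_ij = k_i with p_ij = 1 / (1 + exp(-(lam_i + lam_j)))."""
    n = len(degrees)
    lam = [0.0] * n

    def expected_degree(i, value):
        return sum(1.0 / (1.0 + math.exp(-(value + lam[j])))
                   for j in range(n) if j != i)

    for _ in range(sweeps):
        for i in range(n):
            lo, hi = -40.0, 40.0
            for _ in range(60):  # bisection on the monotone map lam_i -> expected degree
                mid = 0.5 * (lo + hi)
                if expected_degree(i, mid) < degrees[i]:
                    lo = mid
                else:
                    hi = mid
            lam[i] = 0.5 * (lo + hi)
    return {(i, j): 1.0 / (1.0 + math.exp(-(lam[i] + lam[j])))
            for i, j in combinations(range(n), 2)}

def check_entropy_relation(degrees):
    """Return (ln number of graphs, total Shannon entropy + ln probability of the degrees)."""
    n = len(degrees)
    p = conjugate_probabilities(degrees)
    pairs = list(p)
    shannon = -sum(q * math.log(q) + (1 - q) * math.log(1 - q) for q in p.values())
    count, probability = 0, 0.0
    for links in product((0, 1), repeat=len(pairs)):
        realized = [0] * n
        weight = 1.0
        for (i, j), a in zip(pairs, links):
            realized[i] += a
            realized[j] += a
            weight *= p[(i, j)] if a else 1.0 - p[(i, j)]
        if realized == list(degrees):
            count += 1
            probability += weight
    return math.log(count), shannon + math.log(probability)

# Hypothetical degree sequence (realized, for example, by a path on five nodes).
log_count, canonical_side = check_entropy_relation((1, 2, 2, 2, 1))
```

The two returned numbers agree up to the accuracy of the fit, which is the finite-size counterpart of Eq. (59) before dividing by $N$.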



B. Special cases for constraining degrees

Following the simplification of Sec. II C, we assume that the constraining degrees $K_\alpha$ are expectation values of the canonical ensemble. Using Eq. (59) we get

$$ \Sigma = S^*[P] - \Omega^*[\{k_\alpha\}], \tag{66} $$

where $\Omega^*[\{k_\alpha\}]$ is given by Eq. (32) and the $k_\alpha$ are the expected degrees of the canonical ensembles. For sparse networks, we can use Eq. (50) and $\Sigma$ takes the simple form

$$ \Sigma = S^*[P] + (|\partial\ell| - 1) \sum_{k>0} n_k \ln \pi_k(k), \tag{67} $$

where $n_k$ is the probability that a random constraint enforces the degree $k$.

We note that when using a Gaussian approximation [12, 13] for network models with linear constraints, the value for the entropy $\Sigma_G$ obtained for the microcanonical ensembles is reasonably good, with an estimated error equal to

$$ |\Sigma - \Sigma_G| \leq \frac{1}{N} \sum_{\alpha=1}^{M} \left| \ln \frac{e}{\sqrt{2\pi}} \right| = \frac{M}{N} \left| \ln \frac{e}{\sqrt{2\pi}} \right| \simeq \frac{M}{N} \, 0.08. \tag{68} $$
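One way to see where a number of this size can come from is to compare the Poisson weight $\pi_K(K)$ with the Gaussian value $1/\sqrt{2\pi K}$. The Python sketch below assumes, as an interpretation that is not stated explicitly in the text, that the Gaussian approximation amounts to this replacement for each constraint, and scans the resulting logarithmic error over $K$.

```python
import math

def gaussian_log_error(k):
    """ln[ sqrt(2 pi k) * pi_k(k) ]: log-ratio of the Poisson weight at its mean
    to the Gaussian value 1 / sqrt(2 pi k) (interpretation assumed here)."""
    log_poisson = k * math.log(k) - k - math.lgamma(k + 1)
    return log_poisson + 0.5 * math.log(2.0 * math.pi * k)

errors = {k: abs(gaussian_log_error(k)) for k in range(1, 101)}
worst_k = max(errors, key=errors.get)
worst_error = errors[worst_k]
```

Under this assumption the error per constraint is largest for $K = 1$, where it equals $|\ln(e/\sqrt{2\pi})|$, and decreases for larger $K$.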



We conclude this section with the expressions for $\Sigma$ for a few specific constraints.

1. Degree sequence

From Eq. (66) we get in the case of the sparse configuration model

$$ \Sigma = S^*[P] + \sum_{k>0} p_k \ln[\pi_k(k)], \tag{69} $$

where $p_k$ is the probability of observing a node with degree $k$.

2. Community structure and degree sequence

In the ensemble with given degree sequence and a constraint on the number of links within and between communities $q = 1, \ldots, Q$, with total number of communities $Q = O(\sqrt{N})$, we obtain for sparse networks

$$ \Sigma = S^*[P] + \sum_{k>0} p_k \ln[\pi_k(k)] + \frac{1}{N} \sum_{q \leq q' = 1}^{Q} \log\left[ \pi_{A(q,q')}(A(q, q')) \right], \tag{70} $$

where $A(q, q')$ is given by Eq. (53).

3. Links at a given distance and degree sequence

When the constraints are on the number of links at a specific distance and the degree sequence, the expression for the entropy of the microcanonical ensemble takes the form

$$ \Sigma = S^*[P] + \sum_{k>0} p_k \log[\pi_k(k)] + \lim_{N \to \infty} \frac{1}{N} \sum_{s=1}^{L} \log\left[ \pi_{B_s}(B_s) \right], \tag{71} $$

where $B_s$ is given by Eq. (55), valid for sparse networks.

IV. CONCLUSIONS



In conclusion, we have derived exact results for the large deviation properties of canonical network ensembles and for the entropy of microcanonical network ensembles in the case of simple networks with linear constraints.

Our results apply to simple networks with given degree sequence and community structure, and to networks embedded in a metric space. Our approach makes use of the transparent cavity method, which can also be extended to other types of constraints or to directed networks.

Our calculations are valid even when the number of constraints scales like $M = O(N^2)$. Nevertheless, only in the case of a linear number of constraints, i.e., $M = O(N)$, can we ensure that the entropy $\Omega[\{K_\alpha\}]$ remains finite in the limit $N \to \infty$.

Further inquiry will be directed toward the exact evaluation of the entropy of weighted networks and of networks wherein the number of loops passing through each node is constrained. Moreover, the relation between the information entropy of the network ensembles studied here and the von Neumann entropy, as introduced in [32], presents further scope for investigation.

G. B. acknowledges stimulating discussions with A. Annibale and A.C.C. Coolen.